DEG. Each such case (or control) expression will be merged into the
tight cluster. The tight cluster is therefore adaptively updated,
changing its size as the case expressions are scanned. All case
expressions are scanned one by one until the point where a case
expression cannot be treated as a member of the tight cluster and its
membership of the tight cluster is significantly denied. In other words, the
process of the algorithm terminates when the remaining case
expressions, which are treated as outliers, cannot be merged into the tight
cluster. An alternative that allows an outlier to be present among the control
expressions in DOG is to use the 9th percentile of the control expressions
as an initial tight cluster [Yang and Yang, 2013]. For a DEG, most
case expressions will deviate significantly from a tight cluster
built from the control expressions. Therefore, the process of
searching for an outlier using the tight cluster approach will terminate
at an early stage.
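The scanning procedure described above can be sketched as follows. This is only an illustration under stated assumptions, not the DOG implementation: the text does not say how membership is "significantly denied", so the sketch assumes a simple threshold of `k` standard deviations from the current cluster mean, a fixed `sigma`, and a nearest-first scanning order. The function name `scan_tight_cluster` and its parameters are hypothetical.

```python
from statistics import mean


def scan_tight_cluster(cluster, candidates, sigma, k=3.0):
    """Grow `cluster` with `candidates` until one is rejected.

    Returns (final_cluster, outliers). The rejection rule
    |value - cluster mean| > k * sigma is an assumption made for
    illustration; the source does not specify the actual test.
    """
    cluster = list(cluster)
    # Assumption: visit candidates from nearest to farthest from the
    # initial cluster center.
    center = mean(cluster)
    ordered = sorted(candidates, key=lambda v: abs(v - center))
    for i, value in enumerate(ordered):
        center = mean(cluster)  # the cluster is updated adaptively
        if abs(value - center) > k * sigma:
            # Membership denied: stop, the rest are treated as outliers.
            return cluster, ordered[i:]
        cluster.append(value)
    return cluster, []


controls = [5.0, 5.2, 4.9, 5.1]
cases = [5.3, 5.0, 9.0, 8.5]
cluster, outliers = scan_tight_cluster(controls, cases, sigma=0.2)
print(outliers)  # [8.5, 9.0]
```

With these toy values the two case expressions close to the controls are merged, and the scan stops at the first case expression that falls outside the assumed threshold.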
It is common for the control expressions and the case expressions
of a gene to overlap. Therefore, DOG has been
updated in this chapter. Rather than using the 9th percentile of the
control expressions, a new initial tight cluster is formed with a slightly
different method, described as follows.
The following equation defines an empirical standard
deviation, where the coefficient 1.4826 was used in the COPA algorithm
[Tomlins, et al., 2005] and $\mu$ stands for the median of all expressions,
\[
\hat{\sigma} = 1.4826 \times \operatorname{median}\left(|\mathbf{x} - \mu|\right) \tag{6.16}
\]
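As a minimal sketch of Equation (6.16), the estimate can be computed with the Python standard library alone. The function name `mad_sigma` is hypothetical. The factor 1.4826 is the usual scaling that makes the median absolute deviation comparable to a standard deviation for normally distributed data.

```python
from statistics import median


def mad_sigma(x, scale=1.4826):
    """Return scale * median(|x - mu|), where mu is the median of x."""
    mu = median(x)
    return scale * median([abs(v - mu) for v in x])


# Toy expression values with one extreme value; the estimate is driven
# by the bulk of the data rather than by the extreme value.
expressions = [1.0, 2.0, 3.0, 4.0, 100.0]
print(mad_sigma(expressions))  # 1.4826
```

Here the median is 3.0, the absolute deviations are 2, 1, 0, 1 and 97, and their median is 1, so the estimate equals the scale factor itself.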
Figure 6.17 shows the relationship between the true (or expected)
standard deviations ($\sigma$) and the estimated standard deviations ($\hat{\sigma}$) obtained with the
above equation. The correlation between the two sets of standard deviations
is about 0.97, meaning that an estimated standard deviation can be a very
good approximation to the true or expected standard deviation and can be
used as an initial standard deviation to start a tight cluster in the algorithm.
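The kind of comparison described above can be illustrated with a small simulation. This is a sketch under assumptions and does not reproduce the reported correlation: it assumes normally distributed expressions, 200 hypothetical genes with true standard deviations drawn uniformly from 0.5 to 3.0, and 50 samples per gene. The names `mad_sigma` and `pearson` are hypothetical helpers.

```python
import random
from statistics import mean, median


def mad_sigma(x, scale=1.4826):
    """Return scale * median(|x - mu|), where mu is the median of x."""
    mu = median(x)
    return scale * median([abs(v - mu) for v in x])


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    var_a = sum((u - ma) ** 2 for u in a)
    var_b = sum((v - mb) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
# Assumed simulation design: one true sigma per simulated gene.
true_sigmas = [rng.uniform(0.5, 3.0) for _ in range(200)]
estimates = [
    mad_sigma([rng.gauss(0.0, s) for _ in range(50)]) for s in true_sigmas
]
r = pearson(true_sigmas, estimates)
print(f"correlation between true and estimated sigma: {r:.2f}")
```

The correlation obtained this way depends on the assumed sample size and range of true standard deviations, so it should be read as a qualitative check only.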